UTS: An Unbalanced Tree Search Benchmark

نویسندگان

  • Stephen Olivier
  • Jun Huan
  • Jinze Liu
  • Jan Prins
  • James Dinan
  • P. Sadayappan
  • Chau-Wen Tseng
چکیده

This paper presents an unbalanced tree search (UTS) benchmark designed to evaluate the performance and ease of programming for parallel applications requiring dynamic load balancing. We describe algorithms for building a variety of unbalanced search trees to simulate different forms of load imbalance. We created versions of UTS in two parallel languages, OpenMP and Unified Parallel C (UPC), using work stealing as the mechanism for reducing load imbalance. We benchmarked the performance of UTS on various parallel architectures, including sharedmemory systems and PC clusters. We found it simple to implement UTS in both UPC and OpenMP, due to UPC’s shared-memory abstractions. Results show that both UPC and OpenMP can support efficient dynamic load balancing on shared-memory architectures. However, UPC cannot alleviate the underlying communication costs of distributed-memory systems. Since dynamic load balancing requires intensive communication, performance portability remains difficult for applications such as UTS and performance degrades on PC clusters. By varying key work stealing parameters, we expose important tradeoffs between the granularity of load balance, the degree of parallelism, and communication costs.

برای دانلود رایگان متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

منابع مشابه

X10 for Productivity and Performance at Scale

We implement all four HPC Class II benchmarks in X10: Global HPL, Global RandomAccess, EP Stream (Triad), and Global FFT. We also implement the Unbalanced Tree Search benchmark (UTS). We show performance results for these benchmarks running on an IBM Power 775 Supercomputer utilizing up to 47,040 Power7 cores. We believe that our UTS implementation demonstrates that X10 can deliver unprecedente...

متن کامل

Evaluating OpenMP 3.0 Run Time Systems on Unbalanced Task Graphs

The UTS benchmark is used to evaluate task parallelism in OpenMP 3.0 as implemented in a number of recently released compilers and run-time systems. UTS performs parallel search of an irregular and unpredictable search space, as arises e.g. in combinatorial optimization problems. As such UTS presents a highly unbalanced task graph that challenges scheduling, load balancing, termination detectio...

متن کامل

Overlay-Centric Load Balancing: Applications to UTS and B&B

To deal with dynamic load balancing in large scale distributed systems, we propose to organize computing resources following a logical peer-to-peer overlay and to distribute the load according to the so-defined overlay. We use a tree as a logical structure connecting distributed nodes and we balance the load according to the size of induced subtrees. We conduct extensive experiments involving u...

متن کامل

Effects of Network Structure Improvement on Distributed RDF Querying

In this paper, we analyze the performance of distributed RDF systems in a peer-to-peer (P2P) environment. We compare the performance of P2P networks based on Distributed Hash Tables (DHTs) and search-tree based networks. Our simulations show a performance boost of factor 2 when using search-tree based networks. This is achieved by grouping related data in branches of the tree, which tend to be ...

متن کامل

Should Static Search Trees Ever Be Unbalanced?

In this paper we study the question of whether or not a static search tree should ever be unbalanced. We present several methods to restructure an unbalanced k-ary search tree T into a new tree R that preserves many of the properties of T while having a height of log k n+1 which is one unit off of the optimal height. More specifically, we show that it is possible to ensure that the depth of the...

متن کامل

ذخیره در منابع من


  با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید

عنوان ژورنال:

دوره   شماره 

صفحات  -

تاریخ انتشار 2006